The Turing Way - Code Styling and Linting

Amelia Edmondson-Stait & The Turing Way Community

29th Apr, 2025


  • Have you ever opened a syntax or script file two years after running an analysis only to find that you have no immediate memory of the code?
  • Have you received analysis files from a collaborator, or downloaded them from an online repository that you have never used before?

  • Now imagine that these files are very hard to read, or that lots of variables are being passed to arcane functions, or worse, that you can’t find the useful code because the files are saved with meaningless names such as:
    • analysis_1final_FINAL.R
    • onlyusethisoneforanalysis_onamonday2a.py

  • If you have not, then you are one of the lucky ones! But if you have, then you know how frustrating it is to work with those files.

  • This chapter highlights ways to avoid such challenges in your projects by introducing some principles of ‘code hygiene’, otherwise known as linting.

illustration by Scriberia


Overview


If you keep the following advice in mind while coding, your code will be more reusable, adaptable, and clear.

The Zen of Python, by Tim Peters



Guidelines for Code Styling

  • Style guidelines differ between organisations, languages, and over time.
  • Even the Python style guide, Python Enhancement Proposal 8 (PEP 8), has had numerous revisions since it was released in 2001.
  • You must choose a framework that is best for your purposes: be they for your benefit or the benefit of others.
  • It is also important to remain consistent (and not consistently inconsistent)!
  • Style guidelines include advice for file naming, variable naming, use of comments, and whitespace and bracketing.

The following are links to existing style guides that may be of use when deciding how to format your code:



However, to get started quickly, the following sections present some advice for code style.

File Naming

  • The Center for Open Science has some useful suggestions for naming files, particularly for ensuring that names are readable by both humans and machines. These include avoiding special characters (@£$%), using underscores (“_”) to delimit pieces of information, and using dashes (“-”) to join words within one piece of information in place of spaces.

  • They also suggest dating or numbering files and avoiding words like FINAL (or FINAL-FINAL).
  • The suggested date format is the long format YYYY-MM-DD, followed by the name of the file and the version number.
  • This results in automatic, chronological ordering. For example:
data <- read.csv("2019-05-17_Turing-Way_Book-Dash.csv")
  • For more details, please see the chapter on File Naming.
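The naming convention above can be sketched in a few lines of Python. This is only an illustration: the helper name `build_filename`, its parameters, and the `v` prefix on the version number are assumptions made for this example and are not taken from the Center for Open Science guidance.

```python
from datetime import date


def build_filename(day, fields, extension, version=None):
    """Build a dated file name (hypothetical helper, for illustration only)."""
    # ISO 8601 dates (YYYY-MM-DD) sort chronologically as plain text.
    parts = [day.isoformat()]
    # Dashes join the words within one field; underscores separate fields.
    parts.extend("-".join(field.split()) for field in fields)
    if version is not None:
        # The "v" prefix is an assumption of this sketch.
        parts.append(f"v{version}")
    return "_".join(parts) + "." + extension


print(build_filename(date(2019, 5, 17), ["Turing Way", "Book Dash"], "csv"))
# 2019-05-17_Turing-Way_Book-Dash.csv
```

Generating names with a helper like this, rather than typing them by hand, is one way to keep the pattern consistent across a project.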

Versioning

  • An extra consideration alongside file naming is versioning your software.
  • Following versioning guidelines will help you avoid suffixes like _FINAL.R.
  • A typical convention is the Major-Minor-Patch (or Major-Minor-Revision) approach.
  • With this convention, your first attempt at a package or library might look like: my-package_1_0_0.py
  • This indicates the first major release (1) with no minor revisions (0) and no patches (0) applied yet.
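As a rough sketch of how the three numbers change over time, the Python snippet below increments one part of a version and formats it in the underscore style used above. The function names `bump` and `label` are hypothetical, and resetting the lower parts to zero when a higher part increases is an assumption borrowed from common semantic-versioning practice, not something stated in this chapter.

```python
def bump(version, part):
    """Return a new (major, minor, patch) tuple with one part incremented."""
    major, minor, patch = version
    if part == "major":
        return (major + 1, 0, 0)  # assumption: lower parts reset to zero
    if part == "minor":
        return (major, minor + 1, 0)
    if part == "patch":
        return (major, minor, patch + 1)
    raise ValueError(f"unknown part: {part!r}")


def label(name, version, extension):
    """Format a file name such as my-package_1_0_0.py."""
    numbers = "_".join(str(n) for n in version)
    return f"{name}_{numbers}.{extension}"


first = (1, 0, 0)
print(label("my-package", first, "py"))                 # my-package_1_0_0.py
print(label("my-package", bump(first, "patch"), "py"))  # my-package_1_0_1.py
print(label("my-package", bump(first, "minor"), "py"))  # my-package_1_1_0.py
```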

Variable naming conventions

  • CamelCase
  • lowerCamelCase
  • Underscore_Methods
  • Mixed_Case_With_Underscores
  • lowercase

It is important to choose one style and stick to it:

ThisIs Because_SwitchingbetweenDifferentformats is.difficult to read.
Where_as if_you stick_to one_style, your_code will_be easier_to_follow!
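If a project has already drifted between styles, a small script can help bring names back to one convention. The following is a minimal sketch, assuming the chosen target style is lowercase with underscores; the function name `to_snake_case` is hypothetical, and the regular expression only handles simple cases (it does not split runs of capitals such as `HTTPServer`).

```python
import re


def to_snake_case(name):
    """Convert CamelCase, lowerCamelCase or Mixed_Case names to snake_case."""
    # Insert an underscore wherever a lowercase letter or digit is
    # immediately followed by an uppercase letter, then lowercase everything.
    with_breaks = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return with_breaks.lower()


for original in ["CamelCase", "lowerCamelCase", "Mixed_Case_With_Underscores"]:
    print(original, "->", to_snake_case(original))
```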

Writing Human Readable Code

  • Writing clear, well-commented, readable, and reusable code benefits not only you but also the community (or audience) that you are developing it for.
  • This may be your lab, external collaborators, stakeholders, or you might be writing open source software for global distribution!
  • Whatever scale you work at, readability counts!

Line Length

  • There is some agreement on how long a line of code should be.
  • PEP 8 suggests a maximum of 79 characters per line, and the R style guide suggests 80.
  • This means that the lines can easily fit on a screen, and multiple coding windows can be opened.
  • It is argued that if your line is any longer than this then your function is too complex and should be separated!

  • This is the crux of the tidyverse approach to R programming, which has a special pipe operator, %>%, that passes the previous object to the next function so that fewer characters are required per line:
recoded_melt_dat <- read_csv('~/files/2019-05-17_dat.csv') %>%
  recode() %>%
  melt()  # We now have a recoded, melted dataframe called recoded_melt_dat
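The line-length limits mentioned in this section are easy to check automatically. The snippet below is a minimal sketch of such a check in Python; the function name `long_lines` is hypothetical, and the default limit of 79 follows the PEP 8 figure quoted above. Dedicated linters perform this check (and many others) more thoroughly.

```python
def long_lines(source, limit=79):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# A two-line example: the second line is 105 characters long.
example = "short = 1\n" + "x = " + "1 + " * 25 + "1\n"
print(long_lines(example))  # [(2, 105)]
```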

Commenting

  • Generally, comment the “why”, not the “what”.
  • The PEP 8 guidelines have firm suggestions that block comments should be full sentences, have two spaces following a period, and follow a dated style guide (Strunk and White).
  • Inline comments should be used sparingly.
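To illustrate the difference between commenting the “why” and the “what”, here is a small, invented Python example. The function, the data, and the 100 ms threshold are all assumptions made up for this illustration.

```python
def mean_reaction_time(times_ms):
    """Return the mean of the plausible reaction times, in milliseconds."""
    # A "what" comment would only restate the code, for example:
    #   keep the values that are at least 100
    # A "why" comment records the reasoning instead:
    # Responses faster than 100 ms are treated as anticipations rather
    # than genuine reactions, so they are excluded from the mean.
    valid = [t for t in times_ms if t >= 100]
    if not valid:
        raise ValueError("no plausible reaction times to average")
    return sum(valid) / len(valid)


print(mean_reaction_time([50, 200, 400]))  # 300.0
```

The code already shows what happens; the comment preserves the decision that a future reader could not reconstruct from the code alone.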

Indentation

The R style guide suggests that lines should be indented:

by
  two spaces

And not

 a mixture
   of
    tabs
      and   spaces.

  • These are of course just guidelines, and you should choose elements that suit your coding style.
  • Again, however, it is important to be consistent when collaborating and to agree on a common style.
  • It could be useful to create a readme file describing your coding style so collaborators or contributors can follow your lead.
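Mixed tabs and spaces, as in the indentation example above, are hard to spot by eye but simple to detect with a script. The following Python snippet is a minimal sketch of such a check; the function name `mixed_indentation` is hypothetical, and it only reports lines whose leading whitespace contains both characters.

```python
def mixed_indentation(source):
    """Return line numbers whose leading whitespace mixes tabs and spaces."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Everything before the first non-blank character is the indent.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            flagged.append(number)
    return flagged


sample = "a\n  b\n\t c\n\td\n"
print(mixed_indentation(sample))  # [3]
```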

Coding Style Tools

  • There are automatic tools that you can use to lint your code against existing guidelines.
  • These range from IDE plugins and packages that ‘spell-check’ your style to scripts that automatically lint your code for you.

lintr

  • lintr is an R package that spell-checks your code using a variety of style guidelines. It can be installed from CRAN.
  • The function lint takes a filename as an argument, along with a list of ‘linters’ that it should check your code against.
  • These range from whitespace conventions to checking that curly brackets do not have their own lines.
  • The output provides a list of markers with recommendations for changing the formatting of your code line-by-line, meaning it is best used early and often in your project.

An example of how lintr output may look for an input file containing R code.

For more details, please visit the lintr GitHub repository.

Autopep8

  • Autopep8 is a Python module that can be run from the terminal and automatically formats a file to pycodestyle (formerly called pep8) guidelines.
  • It is available on PyPI and can be installed using pip.
# Install autopep8
$ pip install --upgrade autopep8

You can modify a file in place by running the following command:

$ autopep8 --in-place --aggressive --aggressive <filename>

Automating re-formatting

  • Linters and automatic reformatting save time.
  • When working with others, it is possible to automate these processes further, for example by using GitHub Actions.
  • The lintr R package provides guidelines for doing so.

Other resources:

Exercise:

  • For the remainder of today’s coding club session, try to:
  • Install lintr in RStudio and try reformatting your code following the Tidyverse R style guide.
  • You’re welcome to reformat one of my scripts here instead of your own code if you like.
  • Advanced: Set up GitHub Actions to automate linting.